(54) INTERFACE DEVICE

(57) The invention relates to an interface apparatus 
for making input and output of appliances having display 
such as computer, word processor, information appli- 
ance and television, comprising recognizing means for 
recognizing the shape or move of the hand of an opera- 
tor, display means for displaying the features of the 
shape or move of the hand recognized by the recogniz- 
ing means as special shape in the screen, and control 
means for controlling the information displayed in the 
screen by the special shape displayed in the screen by 
the display means, wherein the two-dimensional or 
three-dimensional information displayed in the screen
can be selected, indicated or moved only by changing
the shape or moving the hand, so that the interface
apparatus of very excellent controllability and high 
diversity may be presented. 
Description

BACKGROUND OF THE INVENTION 

The present invention relates to an interface appa- 
ratus for input and output of information apparatus such 
as computer and word processor and appliance having 
a display such as television. 

In a kind of conventional interface apparatus, it is 
designed to display a cursor at a coordinate position 
detected by the mouse on a display screen, for adding 
some other information to the information in the display 
device, or changing or selecting the displayed informa- 
tion. 

Fig. 30 shows an outline of this conventional inter- 
face apparatus. In Fig. 30, reference numeral 501 
denotes a host computer, and 502 is a display, and vir- 
tual operation buttons 503, 504, 505 are displayed in the 
display 502 by the host computer 501. Reference 
numeral 506 represents a mouse cursor, and the host 
computer 501 controls the display so as to move in the 
screen in synchronism with the move of the mouse 507, 
on the basis of the moving distance of the mouse 507 
detected by the mouse 507. As the user moves the 
mouse 507, the mouse cursor 506 is moved to the posi- 
tion of a desired virtual operation button in the display 
screen, and by pressing a switch 508 on the mouse 507, 
an operation button is selected so as to instruct action to 
the host computer 501 . 

In this conventional construction, however, the 
mouse or the input device is necessary in addition to the 
main body of the appliance, and a table or area for 
manipulating the mouse is also needed, which is not 
suited to portable information appliance or the like. 
Besides, by manipulation through the mouse, it is not a 
direct and intuitive interface. 

SUMMARY OF THE INVENTION 

It is an object of the invention to present an inter- 
face apparatus capable of manipulating an appliance 
easily without requiring input device such as keyboard 
and mouse. It is other object thereof to present an inter- 
face apparatus further advanced in the ease of manipu- 
lation of indicating or catching the display object by 
judging interactions along the intent of the operator 
sequentially and automatically. 

In structure, the invention provides an interface 
apparatus comprising recognizing means for recogniz- 
ing the shape of a hand of an operator, display means 
for displaying the features of the shape of the hand rec- 
ognized by the recognizing means on the screen as a 
special shape, and control means for controlling the 
information displayed in the screen by the special shape 
displayed in the screen by the display means, whereby
the information displayed in the screen can be control- 
led only by varying the shape of the hand. 

It is a further object to present an interface apparatus
much superior in ease of manipulation by recognizing
also the move of the hand. To recognize the move, a
frame memory for saving the image picking up the
shape or move of the hand, and a reference image
memory for storing the image taken before the image
saved in the frame memory as reference image are
provided, and it is achieved by depicting the difference
between the image in the frame memory and the
reference image stored in the reference image memory. In
another method of recognition, the shape or move of the
hand of the user in the taken image is depicted as the
contour of the user, and its contour is traced, and the
relation between the angle of the contour line and the
length of contour line, that is, the contour waveform is
calculated and filtered, and the shape waveform
expressing the specified shape is generated.

Moreover, by comprising cursor display means for
displaying a feature of the shape of a hand on the screen
as a special shape and manipulating it as a cursor, means
for storing, in relation to each display object other than
the cursor display, the coordinates and shape of a
representative point representing the position of that
display object, and means for calculating
and judging the interaction of the cursor display and
the display object, manipulation is realized smoothly by
the interactions along the intent of the operator when
gripping the displayed virtual object in the case of
display of the cursor display as a virtual manipulator.

In the interface apparatus thus constructed, as the
user faces the recognizing means and shows, for
example, a hand, the special shape corresponding to the
shape of the hand is displayed as an icon in the screen
for screen manipulation, so that control according to the
icon display is enabled.

Or when instructed by hand gesture, the given hand
gesture is displayed as a special shape set corresponding
to the shape of the hand on the display screen, and
its move is also displayed, and, for example, a virtual
switch or the like displayed on the display screen can be
selected by the hand gesture, or the display object
displayed on the screen can be grabbed or carried depending
on the purpose, and therefore without requiring
mouse or other input device, a very simple manipulation
of appliance is realized.

It is further possible to realize the interface much
enhanced in the ease of manipulation by sequentially
and automatically judging the interaction with the
display object desired to be operated by the virtual
manipulator according to the intent of operation of the
operator, as the special shape set corresponding to the
shape of the hand works as virtual manipulator aside
from the mere cursor.



BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 is an appearance drawing of an interface 
apparatus in a first embodiment of the invention; 
Fig. 2 is a detailed block diagram of the interface 
apparatus in the same embodiment of the invention;

Fig. 3 is a diagram showing an example of shape of 
hand judged by the interface apparatus in the same 
embodiment of the invention;
Fig. 4 is a diagram showing an example of shape 
identifying means of the interface apparatus in the 
same embodiment of the invention; 
Fig. 5 is a diagram showing an example of operation
by an image difference operation unit in the
same embodiment;

Fig. 6 is a diagram showing an example of icon gen- 
erated by an icon generating unit in the same 
embodiment; 

Fig. 7 is an appearance drawing showing an operation
example of the interface apparatus of the same
embodiment; 

Fig. 8 is an appearance drawing of an interface 
apparatus in a second embodiment of the invention; 
Fig. 9 is a detailed block diagram of the interface
apparatus in the second embodiment of the invention;

Fig. 10 is a diagram showing an example of shape 
of hand judged by the interface apparatus of the 
same embodiment;
Fig. 11 is a diagram showing an example of motion
recognizing unit of the interface apparatus of the 
same embodiment; 

Fig. 12 is a diagram showing an example of operation
by an image difference operation unit in the
same embodiment;

Fig. 13 is a diagram showing an operation example 
of the same embodiment; 

Fig. 14 is a detailed block diagram of an interface 
apparatus in a third embodiment of the invention;
Fig. 15 is a diagram showing an example of motion 
recognizing unit of the interface apparatus in the 
third embodiment of the invention; 
Fig. 16 (A) to (D) are diagrams showing examples
of icon displayed on a display screen by the interface
apparatus of the same embodiment;
Fig. 17 is a diagram showing operation of motion 
recognizing unit of the interface apparatus in the 
same embodiment of the invention; 
Fig. 18 is a diagram showing operation of motion
recognizing unit of the interface apparatus in the 
same embodiment of the invention; 
Fig. 19 is a diagram showing operation of motion 
recognizing unit of the interface apparatus in the 
same embodiment of the invention;
Fig. 20 is a diagram showing operation of motion 
recognizing unit of the interface apparatus in the 
same embodiment of the invention; 
Fig. 21 is a diagram showing an interface apparatus 
explaining a fourth embodiment;
Fig. 22 (a) is a diagram showing an open state of 
cursor in an example of a cursor used in the inter- 
face apparatus of the same embodiment; 



(b) is a diagram showing a closed state of the same 
embodiment; 

(c) is a diagram showing an open state of cursor in 
an example of a cursor used in the interface appa- 
ratus of the same embodiment; 

(d) is a diagram showing a closed state of the same 
embodiment; 

(e) is a diagram showing an open state of cursor in 
an example of a cursor used in the interface appa- 
ratus of the same embodiment; 

(f) is a diagram showing a closed state of the same 
embodiment; 

Fig. 23 (a) is a diagram showing the shape of an 
example of a virtual object used in the interface 
apparatus of the same embodiment; 
(b) is a diagram showing the shape of other exam- 
ple of a virtual object used in the interface appara- 
tus of the same embodiment; 
Fig. 24 (a) is a front view showing configuration of 
cursor and virtual object in a virtual space; 
(b) is a side view showing configuration of cursor 
and virtual object in a virtual space; 
Fig. 25 is a diagram showing a display example of 
virtual space for explaining the embodiment; 
Fig. 26 is a block diagram showing an example of 
the interface apparatus of the same embodiment; 
Fig. 27 (a) is a diagram showing an example of 
input device in input means used in the interface 
apparatus of the same embodiment; 

(b) is a diagram showing an example of input device 
in input means used in the interface apparatus of 
the same embodiment; 

(c) is a diagram showing an example of input device 
in input means used in the interface apparatus of 
the same embodiment; 

Fig. 28 (a) is a diagram showing an example of 
image of a hand taken by a camera in the same 
embodiment; 

(b) is a diagram showing a binary example of image 
of a hand taken by a camera in the same embodi- 
ment; 

Fig. 29 (a) is a diagram showing an example of 
image displayed by display means used in the inter- 
face apparatus in the same embodiment of the 
invention; 

(b) is a diagram showing a second example of the 
display screen; 

(c) is a diagram showing a third example of the dis- 
play screen; 

(d) is a diagram showing a fourth example of the 
display screen; and 

Fig. 30 is an explanatory diagram for explaining a 
conventional interface apparatus. 
DESCRIPTION OF THE PREFERRED EMBODIMENTS

(First embodiment) 

A first embodiment of the invention relates to an 
interface comprising recognizing means such as image 
pickup device for recognizing the shape of a hand of the 
operator, display means for displaying the feature of the 
shape of the hand recognized by the recognizing means 
on a screen as a special shape by an icon or the like, 
and control means for controlling the information dis- 
played on the screen by varying the shape of the hand 
by operating the special shape such as icon displayed 
on the screen by the display means as the so-called cur- 
sor. 

Fig. 1 shows the appearance of the first embodi- 
ment of the interface apparatus of the invention. Refer- 
ence numeral 1 denotes a host computer, 2 is a display 
unit, and 3 is a CCD camera for picking up an image.
The CCD camera 3 has the pickup surface located in
the same direction as the display direction, so that the
shape of the hand of the user can be picked up when
the user confronts the display screen. On the display,
menus 201, 202 and an icon 200 reflecting the shape of
the hand are displayed.

Fig. 2 is a detailed block diagram of the invention. 
The image fed in from the CCD camera is stored in a 
frame memory 21. In a reference image memory 25, a 
background image not including person taken previ- 
ously is stored as reference image. The reference 
image may be updated whenever required.

Shape identifying means 22 depicts the difference 
of the image saved in the frame memory and the image 
stored in the reference image memory, and removes the 
background image from the image, depicts, for exam- 
ple, the portion corresponding to the hand of the user, 
and judges if the shape is, for example, one finger as
shown in Fig. 3 (A), two fingers as shown in Fig. 3 (B), 
or three fingers as shown in Fig. 3 (C). 

Fig. 4 shows a detailed example of the shape iden- 
tifying means 22, which comprises an image difference 
operation unit 221, a contour depicting unit 222, and a 
shape identifying unit 223. 

The image difference operation unit 221 calculates 
the difference of the image saved in the frame memory 
and the image stored in the reference image memory as 
mentioned above. As a result, the object to be detected, 
for example, the user, can be separated from the back- 
ground portion. For example, when the image difference 
operation unit 221 is composed of a simple subtraction 
circuit, as shown in Fig. 5, only the portion of the hand 
of the user in the image in the frame memory can be 
depicted. The contour depicting unit 222 depicts the 
contour shape of the object existing in the image as a 
result of operation by the image difference operation 
unit 221. As a practical method, for example, by depicting
the edge of the image, the contour shape may be
easily depicted.
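
The subtraction and edge steps described above can be sketched as follows. This is only an illustrative sketch, not the circuit of the embodiment: the function names `difference_mask` and `contour_pixels`, the list-of-lists grey-level image format, and the threshold of 30 grey levels are all assumptions made for the example.

```python
def difference_mask(frame, reference, threshold=30):
    """Binary mask: 1 where the frame differs from the reference image
    by more than `threshold` grey levels (threshold is an assumed value)."""
    return [
        [1 if abs(f - r) > threshold else 0 for f, r in zip(frow, rrow)]
        for frow, rrow in zip(frame, reference)
    ]


def contour_pixels(mask):
    """Foreground pixels (row, col) that touch background or the image border
    in one of the four axis directions; a simple stand-in for edge depiction."""
    height, width = len(mask), len(mask[0])
    contour = set()
    for y in range(height):
        for x in range(width):
            if not mask[y][x]:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or not mask[ny][nx]:
                    contour.add((y, x))
                    break
    return contour


# Tiny 5x5 example: a uniform background and a bright 3x3 "hand" region.
reference = [[10] * 5 for _ in range(5)]
frame = [row[:] for row in reference]
for y in range(1, 4):
    for x in range(1, 4):
        frame[y][x] = 200

mask = difference_mask(frame, reference)
edge = contour_pixels(mask)
```

In this toy case the centre pixel of the 3x3 region is interior, so the contour consists of the eight surrounding pixels.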

The shape identifying unit 223 identifies specifically
the contour shape of the hand depicted by the contour
depicting unit 222, and judges if the shape is, for
example, one finger as shown in Fig. 3 (A) or two fingers as
shown in Fig. 3 (B). As the shape identifying method, for
example, template matching, matching technique with
shape model, and neural network may be employed,
among others.

An icon generating unit 24 generates an icon image
as a special shape to be shown in the display, on the
basis of the result of identifying the hand shape by the
shape identifying unit 223. For example, when the result
of identifying the shape of the hand was one finger, an
icon of numeral "1" is generated as shown in Fig. 6 (A),
or in the case of two fingers, an icon of numeral "2" is
created as in Fig. 6 (B). As the shape of the icon,
alternatively, when the result of identifying the shape of the
hand was one finger, an icon of one finger may be
shown as in Fig. 6 (C), or in the case of two fingers,
an icon of two fingers may be created as shown in
Fig. 6 (D). A display controller 23 controls the display on
the basis of the result of identifying the shape of the
hand by the shape identifying unit 223. For example,
while displaying the icon according to the result of
identifying, the menu previously corresponding to the result
of identifying is displayed by emphasis on the basis of
the hand shape identifying result.

In this embodiment of the invention, an example of
operation is described below. As shown in Fig. 7 (A),
when the user confronts the appliance having the
interface apparatus of the invention and points out one
finger, an icon of numeral "1" is shown on the display, and
the display of television on the first menu is shown by
emphasis. At this time, by using sound or voice from the
display device in tune with the emphasis display, the
attention of the operator may be attracted. Herein, by
pointing out two fingers as in Fig. 7 (B), an icon of
numeral "2" is shown on the display and the display of
network on the second menu is shown by emphasis. In
this state, by maintaining the same hand shape for a
specific time, the second menu is selected, and an
instruction is given to the host computer so as to display
the network terminal. For selection of menu, sound or
the like may be used at the same time. In the case of
hand shape different from those determined preliminarily
as in Fig. 7 (C), icon and menu are not shown on the
display, and no instruction is given to the host computer.

Thus, according to the invention, by identifying the
shape of the hand in the taken image, it is possible to
control the computer or appliance on the basis of the
result of identifying, and it is possible to manipulate
without making contact from a remote distance without
using keyboard, mouse or other device. Besides, as the
result of identifying the shape of the hand is reflected in
the screen, the user can manipulate while confirming
the result of identifying, and easy and secure
manipulation is possible.
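
The menu behaviour described above (emphasis while a registered shape is shown, selection once the same shape is held for a specific time) might be modelled roughly as below. The class name `DwellSelector`, the use of a frame count in place of a time interval, and the value `hold_frames=3` are assumptions for illustration only.

```python
class DwellSelector:
    """Emphasise the menu matching the current finger count and select it
    once the same count has been seen for `hold_frames` consecutive frames."""

    def __init__(self, menus, hold_frames=3):
        self.menus = menus              # maps finger count -> menu label
        self.hold_frames = hold_frames  # assumed stand-in for the "specific time"
        self.current = None
        self.count = 0

    def update(self, finger_count):
        """Feed one recognition result; return (emphasised_menu, selected_menu)."""
        if finger_count not in self.menus:
            # Shape not determined preliminarily: show nothing, instruct nothing.
            self.current, self.count = None, 0
            return None, None
        if finger_count == self.current:
            self.count += 1
        else:
            self.current, self.count = finger_count, 1
        emphasised = self.menus[finger_count]
        # Selection fires exactly once, on the frame the hold period is reached.
        selected = emphasised if self.count == self.hold_frames else None
        return emphasised, selected


selector = DwellSelector({1: "television", 2: "network"}, hold_frames=3)
results = [selector.update(n) for n in [1, 2, 2, 2, 5]]
```

Here the fourth result carries the selection of "network", and the final, unregistered shape (five fingers in this made-up sequence) yields no emphasis and no selection.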
In this embodiment, this is an example of applying
in selection of menu, but by presetting so that the icon
display according to a specific shape of hand may be
replaced by picture or message, it is also possible to
control display and writing of picture or message.

(Second embodiment) 

A second embodiment of the invention relates to an
interface apparatus composed at least of an image
pickup unit, a motion recognizing unit for recognizing
the shape or move of an object in a taken picture, and
a display unit for displaying the shape or move of the
object recognized by the motion recognizing unit, and
comprising a frame memory for storing the image taken
by the image pickup unit, and a reference image memory
for storing the image taken before the image saved in
the frame memory as reference image, wherein the
motion recognizing unit comprises an image change
depicting unit for depicting the difference between the
image in the frame memory and the reference image
stored in the reference image memory.

Fig. 8 shows the appearance of the second
embodiment of the interface apparatus of the invention. In Fig.
8, same constituent elements as in the first embodiment
are identified with same reference numerals. That is,
reference numeral 1 is a host computer, 2 is a display unit,
and 3 is a CCD camera for picking up an image. The CCD
camera 3 has an image pickup surface located in the
same direction as the display direction, so that the hand
gesture of the user can be picked up as the user
confronts the display surface. On the display surface of the
display unit 2, virtual switches 204, 205, 206, and an
icon of an arrow cursor 203 for selecting the virtual
switches are displayed.

Fig. 9 is a block diagram showing a specific
constitution of the embodiment. The image fed through the CCD
camera 3 is saved in a frame memory 21. A preliminarily
taken image is stored in a reference image memory 25
as a reference image. A reference image updating unit
26 is composed of a timer 261 and an image updating
unit 262, and is designed to update the reference image
by transferring the latest image stored in the frame
memory 21 to the reference image memory 25 at a
specific time interval indicated by the timer 261.
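
A minimal sketch of the timer-driven reference update follows. The class name `ReferenceImageUpdater`, the explicit `now` timestamp argument (used so the example is deterministic) and the one-second interval are assumptions, not details given in the specification.

```python
class ReferenceImageUpdater:
    """Copy the latest frame into the reference image memory whenever
    at least `interval` time units have passed since the last update."""

    def __init__(self, interval):
        self.interval = interval
        self.last_update = None
        self.reference = None   # plays the role of the reference image memory

    def feed(self, frame, now):
        """Offer the latest frame at time `now`; return True if it was stored."""
        due = self.last_update is None or now - self.last_update >= self.interval
        if due:
            self.reference = [row[:] for row in frame]  # copy, do not alias
            self.last_update = now
        return due


updater = ReferenceImageUpdater(interval=1.0)
flags = [
    updater.feed([[1]], now=0.0),   # first frame is always stored
    updater.feed([[2]], now=0.5),   # interval not yet elapsed
    updater.feed([[3]], now=1.0),   # interval elapsed, reference replaced
]
```

Because the reference is a recent frame rather than a fixed background, the difference against it responds mainly to what has moved since the last update.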

A motion recognizing unit 22 depicts the difference
between the image saved in the frame memory
and the image stored in the reference image memory,
and eliminates the background image from the image,
and depicts the portion corresponding, for example, to
the hand of the user, and also judges if the shape is one
finger as shown in Fig. 10 (A) or a fist as shown in Fig.
10 (B).

Fig. 11 shows a detailed example of the motion
recognizing unit 22, being composed of an image difference
operation unit 221, a contour depicting unit 222, a
shape change identifying unit 225, and a position
detector 224.



The image difference operation unit 221 calculates 
the difference between the image saved in the frame 
memory 21 and the image stored in the reference image 
memory 25 as mentioned above. Consequently, the 
object desired to be depicted as motion, for example, 
the hand portion of the user, can be separated from the 
background portion, and only the moving object image 
can be depicted at the same time. For example, when 
the image difference operation unit 221 is composed of 
a mere subtraction circuit, as shown in Fig. 12, the hand 
portion in the reference image and only the hand portion 
of the latest image in the frame memory can be 
depicted, so that only the moving hand portion can be 
easily identified. The contour depicting unit 222
depicts the object existing in the image as the result of 
operation by the image difference operation unit 221, 
that is, the contour shape of the hand portion before 
moving and after moving. As an example of practical 
method, by depicting the edge of the image, the contour 
shape can be easily depicted. 

The shape change identifying unit 225 identifies the 
detail of the contour shape of the hand portion after 
moving being depicted by the contour depicting unit 
222, and judges if the shape is, for example, a finger as 
shown in Fig. 10 (A), or a fist as shown in Fig. 10 (B). At 
the same time, the position detector 224 calculates the 
coordinates of the center of gravity of the contour shape 
of the hand portion of the user after moving. 
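
As an illustration of the position detector, the center of gravity can be approximated by averaging the contour coordinates. The function name `center_of_gravity` and the choice to average contour points (rather than, say, the enclosed area) are assumptions of this sketch.

```python
def center_of_gravity(points):
    """Mean (x, y) of a non-empty list of contour points."""
    if not points:
        raise ValueError("contour must contain at least one point")
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return (mean_x, mean_y)


# Four corner points of a 4 x 2 rectangle.
cog = center_of_gravity([(0, 0), (4, 0), (4, 2), (0, 2)])
```

For the rectangle above the result is the point (2.0, 1.0).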

An icon generating unit 24 generates an icon image
to be shown on the display on the basis of the result of
identifying the hand shape by the shape change
identifying unit 225. As examples of icon image,
when the result of identifying the hand shape is one
finger, the arrow marked icon as shown in
Fig. 13 (A) may be generated, or in the case of a fist
shape, the x-marked icon as shown in Fig. 13 (B) may
be generated. Or, if the identifying result of hand shape
is two fingers, an icon mimicking two fingers as shown,
for example, in Fig. 13 (C) may be generated, or in the
case of a fist, an icon mimicking the fist may be
generated as shown in Fig. 13 (D).

A display controller 23 controls the display position 
of the icon generated by the icon generating unit 24 on 
the display 2, and is composed of a coordinate trans- 
forming unit 231 and a coordinate inverting unit 232. 
The coordinate transforming unit 231 transforms from 
the coordinates of the taken image into display coordi- 
nates on the display 2, and the coordinate inverting unit 
232 inverts the lateral positions of the transformed dis- 
play coordinates. That is, the coordinates of the center 
of gravity in the image of the portion corresponding to 
the user's hand detected by the position detector 224 
are transformed into display coordinates on the display 
2, and the lateral coordinates are inverted to display an 
icon in the display 2. By this manipulation, when the 
user moves the hand to the right, the icon moves to the 
right on the display screen, like a mirror action. 
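
The two steps of the display controller (scaling image coordinates to display coordinates, then inverting laterally) can be sketched as below. The function name `to_display`, the continuous coordinate convention, and the example sizes of 320x240 for the image and 640x480 for the display are assumptions.

```python
def to_display(x_img, y_img, image_size, display_size):
    """Scale image coordinates to display coordinates, then mirror the
    horizontal axis so the icon follows the hand like a mirror image."""
    image_w, image_h = image_size
    display_w, display_h = display_size
    # Coordinate transformation: image space -> display space.
    x_disp = x_img * display_w / image_w
    y_disp = y_img * display_h / image_h
    # Coordinate inversion: flip left and right.
    return (display_w - x_disp, y_disp)


# A hand seen in the left quarter of the camera image appears in the
# right quarter of the display.
icon_pos = to_display(80, 120, (320, 240), (640, 480))
```

Since the camera faces the user, a hand moved to the user's right appears further left in the taken image; the inversion restores the expected direction on screen.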

In thus constituted embodiment, an example of
operation is described below. As shown in Fig. 8, when
the user confronts the appliance incorporating the
interface apparatus of the embodiment and points out one
finger of the hand, the arrow cursor appearing on the
display moves to an arbitrary position corresponding to
the move of the hand. Then, by moving the hand to an
arbitrary one of the virtual switches 204, 205, 206
shown on the display 2, the arrow cursor is moved, and
when the hand is gripped to form a fist, the one of the
virtual switches 204, 205, 206 is selected, and an
instruction is given to the host computer 1.
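
The selection rule in this operation example (cursor over a virtual switch, hand closed into a fist) could be expressed as a simple hit test. The function name `select_switch`, the shape labels `"fist"` and `"one_finger"`, and the rectangle coordinates are hypothetical; only the switch numerals 204 to 206 come from the text.

```python
def select_switch(cursor, shape, switches):
    """Return the id of the switch under the cursor when the recognised
    shape is a fist; return None otherwise."""
    if shape != "fist":
        return None
    cx, cy = cursor
    for switch_id, (left, top, right, bottom) in switches.items():
        if left <= cx <= right and top <= cy <= bottom:
            return switch_id
    return None


# Assumed screen rectangles (left, top, right, bottom) for the three switches.
switches = {
    204: (0, 0, 100, 50),
    205: (0, 60, 100, 110),
    206: (0, 120, 100, 170),
}

pointing = select_switch((50, 80), "one_finger", switches)
gripped = select_switch((50, 80), "fist", switches)
missed = select_switch((300, 300), "fist", switches)
```

With one finger shown the cursor only moves; the fist over switch 205 selects it, and a fist away from every switch selects nothing.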

In this embodiment, it is designed to recognize the 
shape and move of the object in the taken image, but it 
may be also designed to recognize either the shape or 
the move of the object in the taken image.

Thus, according to the invention, which comprises
the motion recognizing unit for recognizing the shape
and/or move of the object in the taken image, display unit
for displaying the shape and/or move of the object
recognized by the motion recognizing unit, frame memory
for saving the image taken by the image pickup means,
and a reference image memory for storing the image
taken before the image saved in the frame memory as
reference image, by depicting the difference between
the image in the frame memory and the reference
image stored in the reference image memory in the
motion recognizing unit, when the user confronts the
image pickup unit and gives instruction by, for example,
a hand gesture, the given hand gesture is shown on the
display screen, and a virtual switch or the like shown on
the display screen can be selected, for example, by the
hand gesture, and a very simple manipulation of
appliance without requiring input device such as mouse is
realized.



(Third embodiment) 






A third embodiment of the invention relates to an
interface apparatus composed of at least an image
pickup unit, a motion recognizing unit for recognizing the
shape and/or move of the hand of the user in the taken
image, and a display unit for displaying the shape and/or
move of the hand of the user recognized by the motion
recognizing unit, wherein the motion recognizing unit is
composed of contour depicting means for depicting the
contour of the taken user image, a contour waveform
operation unit for tracing the depicted contour, and
calculating the relation between the angle of the contour
line and the length of contour line, that is, the contour
waveform, and a shape filter for filtering the contour
waveform calculated by the contour waveform operation
unit for generating a shape waveform expressing a
specific shape.

When the user confronts the image pickup unit of
thus constituted interface apparatus and instructs by
hand gesture, the image pickup unit picks up the user's
image. The contour depicting means depicts the contour
of the user's image, and the contour is transformed
by the contour waveform operation unit into a contour
waveform, that is, the angle of the contour line with
respect to the horizontal line, with the length of the
contour line starting from the reference point on the
contour plotted on the horizontal axis. This contour waveform is
transformed into a shape waveform expressing the uneven
shape of the finger by a shape filter composed of a
band pass filter in specified band, for example, a band
pass filter corresponding to uneven surface of finger,
and the location of the finger is calculated, and only by
counting the number of pulses existing in this shape
waveform, the number of projected fingers, that is, the
shape of the hand can be accurately judged. On the
basis of the position or shape of the hand, the given
hand gesture is shown on the display screen, and, for
example, a virtual switch shown on the display screen
can be selected by the hand gesture, and therefore very
simple manipulation of appliance is realized without
requiring mouse or other input device.

Moreover, plural shape filters may be composed of 
plural band pass filters differing in band, and the motion 
of the user may be judged on the basis of the shape 
waveforms generated by the plural shape filters. As a 
result, plural shapes can be recognized. 

Alternatively, plural shape filters may be composed 
at least of a band pass filter in the contour waveform
shape corresponding to undulations of hand, and a 
band pass filter in the contour waveform shape corre- 
sponding to undulations of fingers. As a result, the 
image is transformed into a smooth shape waveform 
reflecting only undulations of the hand portion or into 
the shape waveform reflecting only undulations of the 
finger. 
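
One crude way to picture a shape filter is a band pass built from the difference of two moving averages: the short window suppresses fine noise and the long window removes slow undulations. This is a swapped-in illustrative technique, not the filter of the embodiment; the function names and window lengths are assumptions.

```python
def moving_average(signal, window):
    """Centred moving average; the window is clamped at the signal ends."""
    half = window // 2
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def band_pass(signal, short_window, long_window):
    """Keep undulations longer than `short_window` samples and shorter
    than roughly `long_window` samples (difference of two smoothings)."""
    fine = moving_average(signal, short_window)
    coarse = moving_average(signal, long_window)
    return [f - c for f, c in zip(fine, coarse)]


flat = band_pass([5.0] * 10, 3, 7)             # no undulation at all
bump = band_pass([0.0] * 10 + [1.0] + [0.0] * 10, 1, 9)  # one narrow bump
```

A constant contour waveform passes nothing, while a narrow bump survives almost unchanged; choosing a different pair of windows would instead favour the broader undulations of the whole hand.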

The motion recognizing unit may be constituted by
comprising a coordinate table for storing the
correspondence between the coordinates of the contour shape
of the taken image of the user and the contour waveform
calculated by the contour waveform operation unit, and
a coordinate operation unit for calculating the
coordinates of the location of the specified shape in the
taken image, by using the wave crest positions of the
shape waveform and the
coordinate table. Hence, the coordinates of the contour
shape are calculated, and the coordinates are issued.

The motion recognizing unit may be also composed 
by comprising a shape judging unit for counting the 
number of pulses in the shape waveform generated by 
the shape filter, and the shape of the object may be 
judged by the output value of the shape judging unit. It 
is hence easy to judge whether hand is projecting two 
fingers or gripped to form a fist, by the number of 
pulses. 

Also the motion recognizing unit may be composed 
by comprising a differentiating device for differentiating 
the shape waveform generated by the shape filter. By 
differentiation, the waveform is more pulse-like, and it is 
easier to count the number of pulses. 
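
The differentiation and pulse counting described above can be sketched with a first difference and a threshold. The function names, the threshold of 1.5, and the decision to count only positive-going pulses (so that each projecting finger contributes one count) are assumptions of this example.

```python
def differentiate(waveform):
    """First difference of a sampled shape waveform."""
    return [b - a for a, b in zip(waveform, waveform[1:])]


def count_pulses(signal, threshold):
    """Count separate runs of samples that rise above `threshold`."""
    pulses = 0
    above = False
    for value in signal:
        if value > threshold and not above:
            pulses += 1
        above = value > threshold
    return pulses


# Made-up shape waveforms: two bumps for two fingers, none for a fist.
two_fingers = [0, 0, 1, 3, 1, 0, 0, 1, 3, 1, 0, 0]
fist = [0] * 12

finger_count = count_pulses(differentiate(two_fingers), threshold=1.5)
fist_count = count_pulses(differentiate(fist), threshold=1.5)
```

The differentiated two-finger waveform has two samples above the threshold, one per bump, while the fist waveform has none.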

The appearance of the embodiment of the interface 
apparatus of the invention is similar to the one shown in 
Fig. 8 relating to the second embodiment, and same
parts as in the second embodiment are explained by 
referring to Fig. 8 and Fig. 10, and only other different 
parts are explained in Fig. 14 and after. 

Fig. 14 is a detailed block diagram of the third
embodiment of the invention. An image fed from the
CCD camera 3 is stored in a frame memory 31. The
motion recognizing unit 32 depicts the portion
corresponding, for example, to the hand of the user from the
image stored in the frame memory 31, and judges if the
shape is, for example, one finger as shown in Fig. 10
(A), or a fist as shown in Fig. 10 (B).

Fig. 15 shows a detail of execution of the motion 
recognizing unit 32, and its detailed operation is 
described while referring also to Fig. 16 to Fig. 20. 

A contour depicting unit 321 depicts the contour 
shape of the object existing in the image. As an example 
of specific method, the image is transformed into binary 
data, and by depicting the edge, the contour shape can 
be easily depicted. Fig. 17 (A1) is an example of 
depicted contour line, showing the hand projecting one 
finger. 

A contour shape operation unit 322, starting from
start point s in the diagram of the contour shape of the
object depicted by the contour depicting unit 321 as
shown in Fig. 17 (A1), traces the contour line in the
direction of the arrow in the drawing (counterclockwise),
depicts the angle θ from the horizontal line of the
contour line at each point x on the contour line, as shown in
Fig. 19, as a function of the distance l from the
start point s, and transforms it into a waveform that
regards the distance l as the time axis, as shown in
Fig. 17 (B1). It simultaneously stores the coordinates
of each point on the contour line corresponding to the
distance l in a transformation table 324 in table form.
A shape filter 1 and a shape filter 2, corresponding to
reference numerals 3231 and 3232 respectively, are filters
for passing the bands corresponding to the undulations
of the hand portion and the undulations of the finger
portion, respectively, in the contour waveform shown in
Fig. 17 (B1).
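
As a minimal sketch of the transformation performed by the contour shape operation unit 322, the following assumes that the contour is already available as an ordered list of (x, y) points with the y axis pointing upward. The function name `contour_waveform` and the sample contour are hypothetical, and angle wrap-around at ±π is not handled.

```python
import math

def contour_waveform(points):
    """Trace an ordered contour and return (waveform, table).

    waveform: list of (l, theta), where l is the distance travelled from the
              start point and theta is the segment angle from the horizontal.
    table:    list of (l, (x, y)), a stand-in for transformation table 324.
    """
    waveform, table = [], []
    l = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        theta = math.atan2(y1 - y0, x1 - x0)  # radians, in (-pi, pi]
        waveform.append((l, theta))
        table.append((l, (x0, y0)))
        l += math.hypot(x1 - x0, y1 - y0)
    return waveform, table

# Hypothetical contour: a square traced counterclockwise from start point s.
square = [(0, 0), (2, 0), (2, 2), (0, 2), (0, 0)]
waveform, table = contour_waveform(square)
```

The table makes it possible to go back from a distance l found in the waveform to the image coordinates of the corresponding contour point.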

By the shape filter 1, Fig. 17 (B1) is transformed
into a smooth shape waveform reflecting only the
undulations of the hand portion as shown in Fig. 17 (B11),
and by the shape filter 2, it is transformed into the shape
waveform reflecting only the undulations of the finger as
shown in Fig. 17 (B12). Both are differentiated by
differentiating devices 3251 and 3252, and finally the shape
differential waveforms shown in Fig. 17 (B112) and Fig. 17
(B122) are obtained. The shape judging unit 3262
judges whether the contour shape of the hand portion is
two fingers as shown in Fig. 10 (A) or a fist as shown in
Fig. 10 (B), and at the same time the coordinate
operation unit 3261 calculates the coordinates of the center of
gravity of the contour shape of the portion of the hand of
the user. The coordinate operation unit 3261
determines the positions lc1, lc2 of the large pulse
waveforms in the shape differential waveform shown in
Fig. 17 (B112), transforms them into point c1 and point c2
shown in Fig. 20 by the coordinate transformation table
324, calculates the center of gravity of the hand portion
from the contour line of the hand portion from
point c1 to point c2, and issues it as hand coordinates.

The shape judging unit 3262 counts and issues the
number of pulse waveforms corresponding to the finger
portion in the shape differential waveform in Fig. 17
(B122). That is, in the case of Fig. 17 (B122), since
there are two large pulse waveforms corresponding to
the portion of the finger, it is judged and issued as the
shape of two fingers as shown in Fig. 10 (A). When
the hand is gripped as shown in Fig. 18 (A2), there is
almost no undulation of the finger portion, and the output of
the shape filter 2 is a shape waveform without
undulations as shown in Fig. 18 (B22). Hence the output of
the differentiating device 3252 is also a shape
differential waveform without pulse waveforms as shown in Fig.
18 (B222), the count of pulse waveforms is 0, and it
is judged and issued as the fist shape as shown
in Fig. 10 (B).
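
The differentiation and pulse counting described above can be sketched as follows. This is only an illustration under stated assumptions: the band pass filtering by the shape filter 2 is taken as already done, the waveforms are hypothetical sample data, and the pulse detector is a plain threshold on magnitude (one of the options mentioned for the shape judging unit 3262), not a method confirmed by the specification.

```python
def differentiate(signal):
    """First difference of a sampled shape waveform."""
    return [b - a for a, b in zip(signal, signal[1:])]

def count_pulses(signal, threshold):
    """Count separate runs of samples whose magnitude exceeds threshold."""
    count, inside = 0, False
    for value in signal:
        above = abs(value) > threshold
        if above and not inside:
            count += 1
        inside = above
    return count

# Hypothetical filtered shape waveforms (output of shape filter 2).
finger_wave = [0.0] * 5 + [1.0] * 5 + [0.0] * 5   # one abrupt rise and fall
fist_wave = [0.0] * 15                            # no finger undulation

finger_pulses = count_pulses(differentiate(finger_wave), threshold=0.5)
fist_pulses = count_pulses(differentiate(fist_wave), threshold=0.5)
```

A count of 0 corresponds to the fist case; how a nonzero count maps to a number of fingers depends on the filter and threshold actually used.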

As a practical example of composition of the shape
judging unit 3262, a simple threshold processing method
or a neural network may be employed.

An icon generating unit 34 in Fig. 14 generates an 
icon image to be shown on the display on the basis of 
the result of identifying the shape of the hand by the 
shape judging means 3262 in Fig. 15. For example, if 
the result of identifying the hand shape is a shape of 
one finger, for example, an icon indicated by arrow 
shown in Fig. 16 (A) is created, or in the case of a fist 
shape, an icon indicated by x mark as shown in Fig. 16 
(B) is created. A display controller 33 controls the dis- 
play position of the icon generated by the icon generat- 
ing unit 34 on the display, and is composed of 
coordinate transforming unit 331 and coordinate invert- 
ing unit 332. The coordinate transforming unit 331 
transforms from the coordinates of the taken image into 
display coordinates on the display, and the coordinate 
inverting unit 332 inverts the lateral positions of the 
transformed display coordinates. That is, the coordi- 
nates of the center of gravity in the image of the portion 
corresponding to the user's hand issued by the coordi- 
nate operation unit 3261 in Fig. 15 are transformed into 
display coordinates on the display, and the lateral coor- 
dinates are inverted to display an icon in the display. By 
this manipulation, when the user moves the hand to the 
right, the icon moves to the right on the display screen, 
like a mirror action. 
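
The operation of the coordinate transforming unit 331 and the coordinate inverting unit 332 can be illustrated with a short sketch. It assumes a simple linear scaling from image size to display size; the function name `to_display` and the sizes used are hypothetical.

```python
def to_display(cx, cy, image_size, display_size):
    """Map a center of gravity (cx, cy) in the camera image to the display.

    The image coordinates are scaled to the display size, then the lateral
    (x) coordinate is inverted so that the icon follows the hand like a mirror.
    """
    image_w, image_h = image_size
    display_w, display_h = display_size
    x = cx * display_w / image_w      # coordinate transforming unit 331
    y = cy * display_h / image_h
    return (display_w - x, y)         # coordinate inverting unit 332

# Hypothetical 320x240 camera image shown on a 640x480 display.
center = to_display(160, 120, (320, 240), (640, 480))
left_in_image = to_display(80, 120, (320, 240), (640, 480))
```

A hand seen in the left quarter of the camera image therefore places the icon in the right quarter of the display.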

In thus constituted embodiment, an example of
operation is described below. When the user confronts
the appliance incorporating the interface apparatus of
the embodiment and points out one finger of the hand,
the arrow cursor of the icon 203 appearing on the
display 2 moves to an arbitrary position corresponding to
the move of the hand. Then, by moving the hand toward an
arbitrary one of the virtual switches 204, 205, 206
shown on the display 2, the arrow cursor is moved onto it, and
when the hand is gripped to form a fist, that
virtual switch is selected, and an instruction is given
to the host computer 1.

As an example of the icon to be displayed, as 
shown in Fig. 16 (C), (D), when the shape of the hand 
itself is formed into an icon, it corresponds to the move 
of the actual hand, and it is intuitive. More specifically, 
the images as shown in Fig. 16 (C) and (D) may be 
entered beforehand, or the contour shape data of the 
hand depicted by the contour depicting unit may be con- 
tracted or magnified to a desired size and used as an 
icon image. 

Thus, in this embodiment, when the user confronts
the image pickup unit of the interface apparatus and
gives an instruction, for example, by a hand gesture, the image
pickup unit picks up the image of the user and depicts
the contour of the user's image, which is transformed into a
contour waveform, that is, the angle of the contour line to
the horizontal line plotted against a horizontal axis
representing the length of the contour line from a reference
point on the contour line. This contour waveform is
transformed into a shape waveform expressing the uneven shape
of the fingers by the shape filter composed of a band pass
filter of a specified band, for example, a band pass filter
corresponding to undulations of fingers, and the position of
the hand is calculated, and simultaneously the number
of pulses existing in the shape waveform is counted, so
that the number of projected fingers, that is, the shape
of the hand, can be accurately judged. On the basis of
the position and shape of the hand, the given hand
gesture is shown on the display screen, and, for example, a
virtual switch shown on the display screen can be
selected, and very easy manipulation of the appliance is
realized without requiring a mouse or other input device.

(Fourth embodiment) 

The foregoing embodiments relate to examples of 
manipulation on two-dimensional images shown on the 
display screen, whereas this embodiment relates to 
manipulation of a virtual three-dimensional image 
shown on a two-dimensional display screen. 

Generally, assuming to grasp a virtual object in a 
virtual space by using a cursor, in a displayed virtual 
three-dimensional space, the following constitution is 
considered. 

In Fig. 21, reference numeral A1 is an input device,
A2 is a cursor coordinate memory unit, A3 is an object
coordinate memory unit, A4 is a display device, and A5
is a contact judging unit. Fig. 22 (a) and Fig. 22 (b) show
cursors in two-finger manipulator shape that can be
expressed from the shape of the hand of the operator
same as in the foregoing embodiments. Fig. 22 (a)
shows an open finger state, and Fig. 22 (b) a closed
finger state. Fig. 23 shows an example of a virtual object in
a virtual space. Suppose the operator acts to grab the
virtual object in a virtual three-dimensional space by
using a two-finger cursor. Fig. 24 (a) and Fig. 24 (b)
show the configuration of the cursor and virtual object in the
virtual space when gripping the virtual object by using the
cursor. Fig. 25 shows the display of the display device
A4.

When manipulation of the operator is given to the
input device A1, the cursor coordinates and the two-finger
interval of the cursor stored in the cursor coordinate
memory unit A2 are updated according to the
manipulation. The display device A4 depicts the virtual space
including the cursor and virtual object by using the
information stored in the cursor coordinate memory unit A2
and the position information of the virtual object stored
in the object coordinate memory unit A3. Herein, the
contact judging unit A5 calculates whether the cursor
and virtual object contact with each other in the virtual
space or not, by using the information stored in the
cursor coordinate memory unit A2 and the position
information of the virtual object stored in the object
coordinate memory unit A3. More specifically, the
distance between the plural surfaces composing the cursor
and the virtual object in the virtual space is calculated for
each surface, and when the virtual object is in contact
between the two fingers of the cursor, it is judged that the
cursor has grabbed the virtual object, and thereafter the
coordinates of the object are changed according to the
move of the cursor.

In such technique, however, the display by the
display device is as shown in Fig. 25 in the case of
configuration as shown in Fig. 24 (a) or (b), and the operator
may misjudge that the coordinates are matched
although the cursor and virtual object position are not
matched exactly in the virtual space. Or, in the case of
using the three-dimensional display device or in the
case of combined display of Fig. 24 (a) and (b), smooth
manipulation is difficult due to difference in the sense of
distance in the actual space and in the virtual space.

Thus, due to difference between the sense of
distance in a virtual space which is a display space and the
sense of distance in an actual space, or due to
difference between the motion of cursor intended by the
operator and the actual motion of cursor, interaction
according to the intent of the operator (in this case,
grabbing of the virtual object) cannot be realized
smoothly in the interaction of the cursor and virtual
object in virtual space (for example, when grabbing the
virtual object by a virtual manipulator).

In this embodiment, the cursor can be controlled
with ease by hand gesture or the like by the operator in
the virtual space without making contact. The presence
or absence of occurrence of interaction with the virtual
object in the virtual space is determined not only by the
distance between the cursor in the virtual space and the
constituent element of the virtual space (the surface in
the case of three-dimensional virtual space); the
interaction judging means also induces interaction on an object
whose distance in the virtual space is not necessarily
close. The judgment whether the cursor
induces interaction with the virtual object is thereby made
closer to the intent of the operator, so that the
controllability of the interface may be further enhanced.

A first constitution of the embodiment is an interface
apparatus comprising display means, input means for
changing the position and shape of the cursor displayed
in the display means, cursor memory means for storing
coordinates of a representative point representing the
position of the cursor and the shape of the cursor, object
memory means for storing coordinates of a
representative point representing the position of display object
other than the cursor and shape of the display object,
and interaction judging means for judging interaction
between the cursor and the display object, by using the
position and shape of the cursor stored in the cursor
memory means and the position and shape of the
display object stored in the object memory means, wherein
the interaction judging means is composed of distance
calculating means for calculating the distance between
at least one representative point of the cursor and at
least one representative point of the display object,
motion recognizing means for recognizing the move of
the cursor or change of the shape, and overall judging
means for determining the interaction of the cursor and
display object by using the distance calculated by the
distance calculating means and the result of recognition
by the motion recognizing means.

According to this constitution, presence or absence
of occurrence of interaction between the cursor
manipulated by the operator in the virtual space and the virtual
object in the virtual space is determined not only by the
distance between the cursor in the virtual space and the
constituent element of the virtual object (the surface in
the case of a three-dimensional virtual space), but the
overall judging means judges the presence or absence
of occurrence of interaction by the distance between
representative points calculated by the distance
calculating means and the motion of the cursor recognized by
the motion recognizing means, so that interaction may
be induced also on the object of which distance is not
necessarily close in the virtual space.

When the motion recognizing means recognizes a
preliminarily registered motion in the first constitution, a
second constitution may be designed so that the
interaction judging means may induce an interaction on the
display object of which distance calculated by the
distance calculating means is below a predetermined
reference.

In the first and second constitutions, by installing
move vector calculating means for calculating the
moving direction and moving distance of the cursor in the
display space to compose the interaction judging
means, a third constitution may be composed so as to
determine the interaction of the cursor and display
object by the interaction judging means on the basis of
the moving direction of the cursor and moving distance
of the cursor calculated by the move vector
calculating means.
The third constitution may be modified into a fourth
constitution in which the interaction judging means gen- 
erates an interaction when the cursor moving distance 
calculated by the move vector calculating means is less 
than the predetermined reference value. 

In third and fourth constitutions, the interaction 
judging means may generate an interaction on the dis- 
play object existing near the extension line in the moving 
direction of the cursor calculated by the move vector 
calculating means, so that a fifth constitution may be 
composed. 

In the first to fifth constitutions, the interaction judg- 
ing means may generate an interaction when the shape 
of the cursor and shape of the display object become a 
preliminarily registered combination, which may be 
composed as a sixth constitution. 

In the first to sixth constitutions, by composing the
interaction judging means by incorporating shape
judging means for recognizing the shape of the cursor and
shape of the display object, a seventh constitution may
be constructed so that the interaction judging means
may generate an interaction when the shape of the
cursor and shape of the display object recognized by the
shape judging means coincide with each other.

In the first to seventh constitutions, by comprising
sight line input means for detecting the sight line direction,
an eighth constitution may be composed in which the
interaction judging means generates an interaction
when the motion recognizing means recognizes a
preliminarily registered motion, on the display object near
the extension line of the sight line detected by the sight
line input means.

The eighth constitution may be modified into a ninth 
constitution in which the interaction judging means gen- 
erates an interaction when the cursor is present near 
the extension line of the sight line on the display object 
near the extension line of the sight line detected by the 
sight line input means and the motion recognizing 
means recognizes a preliminarily registered motion. 

In the first to ninth constitutions, when an interac- 
tion is generated, learning means may be provided for 
learning the configuration of the cursor and the objec- 
tive display object, and the shape of the cursor and 
shape of the display object, so that a tenth constitution 
may be composed for determining the interaction on the 
basis of the learning result of the learning means by the 
interaction judging means. 

The tenth constitution may be modified into an elev- 
enth constitution in which the interaction judging means 
generates an interaction when the configuration of the 
cursor and objective display object, or the shape of the 
cursor and shape of the display object may be similar to 
the configuration or shapes learned in the past by the 
learning means. 

Instead of the first to eleventh constitutions, a
twelfth constitution may be composed in which the
interaction judging means may be composed by
incorporating coordinate transforming means for
transforming the coordinates from the cursor memory unit and
object memory unit to the input to the distance
calculating means.

The twelfth constitution may be modified into a thir- 
teenth constitution in which the cursor and objective dis- 
play object may be brought closer to each other when 
an interaction is generated. 

The fourth embodiment is described in detail by
referring to the drawing. Fig. 26 is a block diagram of the
interface apparatus of the embodiment.

In Fig. 26, reference numeral 41 is input means, 42 
is cursor memory means, 43 is object memory means, 
44 is display means, 45 is interaction judging means,
45a is distance calculating means, 45b is motion recog- 
nizing means, 45c is overall judging means, 45d is move 
vector calculating means, 45e is shape judging means, 
45f is learning means, 45g is coordinate transforming 
means, and 46 is sight line input means. 

In Fig. 26, the operator manipulates input means 
41, the cursor memory means 42 changes and stores 
the coordinates and shape of representative point rep- 
resenting the position in the virtual space of the cursor, 
and the display means 44 shows the cursor and virtual 
object in two-dimensional display or three-dimensional 
display on the basis of the coordinates and shape of the 
representative point representing the position in the vir- 
tual space of the cursor stored in the cursor memory 
means 42 and the coordinates and shape of represent- 
ative point representing the position in the virtual space 
of the virtual object stored in the object memory means 
43. 

The sight line input means 46 detects the position 
of the sight line of the operator on the display. The dis- 
tance calculating means 45a calculates the distance 
between the cursor and virtual object in the virtual 
space on the basis of the coordinates of the represent- 
ative points stored in the cursor memory means 42 and 
object memory means 43. The motion recognizing 
means 45b recognizes the motion of manipulation on 
the basis of the data stored in the cursor memory 
means 42 and object memory means 43. The move 
vector calculating means 45d calculates the moving 
direction and moving distance of the cursor in the virtual 
space. The shape judging means 45e judges whether 
the shape of the cursor and shape of the virtual object 
are appropriate for inducing an interaction or not. The 
learning means 45f stores the relation of position and 
shape of the cursor and virtual object when the overall 
judging means 45c has induced an interaction between 
the cursor and virtual object, and tells whether the 
present state is similar to the past state of inducing 
interaction or not. 

The overall judging means 45c judges whether the
cursor and virtual object interact with each other or not,
on the basis of the distance between the cursor and
virtual object issued by the distance calculating means
45a, the result recognized by the motion recognizing
means 45b, the moving direction and moving distance of the
cursor calculated by the move vector calculating means
45d, the position of the sight line detected by the sight line
input means 46, the judging result of the shape judging
means 45e, and the degree of similarity to past
interactions issued by the learning means 45f, and changes the
coordinates and shape of the representative points of
the cursor and virtual object depending on the result of
interaction. The coordinate transforming means 45g
transforms the coordinates of the cursor and objective
virtual object in the virtual space used in the distance
calculation by the distance calculating means 45a so
that the positions of the two may be closer to each other
when the interaction judging means 45 induces an
interaction.

Fig. 22 (a) and (b) show a two-finger manipulator
shape in a first example of the cursor used in the
interface apparatus of the invention. In the diagram, the
fingers are opened in (a) and the fingers are closed in (b).
Fig. 22 (c) and (d) show a two-finger two-joint
manipulator shape in a second example of the cursor used in the
interface apparatus of the invention. In the diagram, the
fingers are opened in (c) and the fingers are closed in
(d). Fig. 22 (e) and (f) show a five-finger hand shape in
a third example of the cursor used in the interface
apparatus of the invention. In Fig. 22, the hand is opened in
(e) and the hand is closed in (f).

Fig. 23 (a) and (b) show examples of the object in
the virtual space used in the interface apparatus of the
invention, showing a cube in (a) and a plane in (b).

In the interface apparatus thus constituted, the
operation is described below. In this embodiment, it is
supposed that the operator moves the cursor as shown
in Fig. 22 in a three-dimensional virtual space, and
moves by grabbing the virtual object as shown in Fig. 23
existing in the virtual space.

Manipulation of the operator is effected on the input
means 41. In the input herein, as the input device for
feeding information for varying the position or shape of
the cursor, the means as shown in Fig. 27 (a) to (c), or
the camera, keyboard, or command input by voice
recognition can be used. Fig. 27 (a) relates to a mouse,
and the cursor is manipulated by moving the mouse
main body or clicking its button. Fig. 27 (b) relates to a
data glove, which is worn on the hand of the operator,
and the cursor is manipulated by reflecting the finger
joint angle or position of the data glove in the actual
space in the position and shape of the cursor. Fig. 27 (c)
relates to a joy stick, and the cursor is manipulated by
combination of lever handling and operation button.
When using a camera, the body or part of the body (for
example, the hand) is taken by the camera, and the
shape and position of the hand are read.

Fig. 28 shows an example of shape depiction when
only the hand is taken by the camera. In Fig. 28 (a), the
hand is taken by the camera. The luminance of pixels of
the image in Fig. 28 (a) is converted into binary data in
Fig. 28 (b). In Fig. 28 (b), it is possible to judge the
degree of opening or closing of the hand by the ratio of
the longer side and shorter side of a rectangle
circumscribing a black region, and input of position and
distance is enabled from the coordinates of center of
gravity and area of the entire black pixels. The input
means 41 sends the manipulation data (cursor moving
distance, cursor shape changing amount, etc.) to the
cursor memory means 42.
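
The quantities read from the binary image of Fig. 28 (b) can be sketched as follows, assuming the image is given as a list of rows in which 1 marks a black (hand) pixel. The function name `hand_features` and the sample image are hypothetical.

```python
def hand_features(binary):
    """Return side ratio, area and center of gravity of the black region.

    binary: list of rows, where a truthy value marks a black (hand) pixel.
    Returns None when the image contains no black pixel.
    """
    pixels = [(x, y) for y, row in enumerate(binary)
              for x, value in enumerate(row) if value]
    if not pixels:
        return None
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    width = max(xs) - min(xs) + 1      # sides of the circumscribing rectangle
    height = max(ys) - min(ys) + 1
    area = len(pixels)
    return {
        "ratio": max(width, height) / min(width, height),
        "area": area,
        "centroid": (sum(xs) / area, sum(ys) / area),
    }

# Hypothetical 5x4 binary image with a 2-wide, 4-tall black region.
image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
features = hand_features(image)
```

The ratio would serve as the measure of opening or closing, the center of gravity as the position, and the area as an indication of distance from the camera.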

The cursor memory means 42 updates and stores the
coordinates and shape of the representative point of the
cursor in the virtual space on the basis of the
manipulation data sent out by
the input means 41. As the representative point, the
coordinates of the center of gravity of the cursor (X0, Y0, Z0)
may be used. Moreover, as the representative point, the
coordinates of the center of each surface composing
the cursor or the coordinates of an apex may also be used.
As the shape, the two-finger interval d in the case of Fig.
22 (a), or the internal angle θn of the joint of each finger
in the case of Fig. 22 (b), (c) (n is a joint number: as θn
becomes smaller, the joint is bent) may be used as the
storage information. Moreover, as the shape, the coordinates
of the finger tip of each finger or of each joint in the
virtual space may also be used.

The object memory means 43 stores the
coordinates and shape of the representative point of the
virtual object in the virtual space shown in Fig. 23 as the
object of manipulation. As the representative point, the
coordinates of the center of gravity of virtual object
(cube: (X1, Y1, Z1), plane: (X2, Y2, Z2)) are used. Also
as the representative point, the coordinates of the
center of each surface composing the virtual object or
the coordinates of the apex may be used. As the shape,
parameter a showing a predetermined shape is stored
(herein cube is defined as a = 1, and plane as a = 2). Also
as the shape, the coordinates of apex may be used.

The display means 44 shows the image in
two-dimensional display as seen from the viewpoint
preliminarily assuming the virtual space on the basis of the
information of the position and shape of the cursor and
virtual object stored in the cursor memory means 42
and object memory means 43. Fig. 29 (a) shows a
display example of display means. When the operator
manipulates, the display position or shape of the cursor
is changed, and the operator continues to manipulate
according to the display.

The interaction judging means 45 judges if the
cursor has grabbed the object or not (presence or absence
of interaction) every time the cursor position changes,
and when it is judged that the cursor has grabbed the
object, the coordinates of the virtual object are moved
according to the move of the cursor.

The distance calculating means 45a calculates the
distance between the coordinates of the center of
gravity of the cursor (X0, Y0, Z0) stored in the cursor
memory means 42 and the coordinates of the center of
gravity of the virtual object (X1, Y1, Z1), (X2, Y2, Z2)
stored in the object memory means 43.

The motion recognizing means 45b recognizes the
motion of "grab" as the preliminarily registered motion
by using the change of shape of the cursor. In the case
of the cursor in Fig. 22 (a), the decreasing state of the
interval d of two fingers is recognized as the "grab"
action, and in the case of the cursor in Fig. 22 (b),
(c), the decreasing state of the angle θn of all fingers is
understood as the "grab" action. As the technique of
recognizing the motion, meanwhile, the time series
changes of the parameters representing the shape
(such as d and θn mentioned above) may be used as
the recognizing technique after learning specific
motions preliminarily by using time series
pattern recognition techniques (table matching, DP
matching, hidden Markov model (HMM), recurrent neural
network, etc.).
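
The simplest form of this recognition, in which a decreasing two-finger interval d is taken as the "grab" action, can be sketched as below. The window of three samples and the function name `is_grab` are assumptions for illustration; the pattern recognition techniques listed above are not shown.

```python
def is_grab(d_history, min_samples=3):
    """Return True when the last min_samples values of d strictly decrease."""
    recent = d_history[-min_samples:]
    if len(recent) < min_samples:
        return False
    return all(later < earlier for earlier, later in zip(recent, recent[1:]))

# Hypothetical time series of the two-finger interval d.
closing = is_grab([10, 9, 7, 4])      # fingers closing steadily
wavering = is_grab([10, 9, 9.5])      # interval increases again
too_short = is_grab([5, 4])           # not enough samples yet
```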

The move vector calculating means 45d calculates
the moving direction and moving distance of the cursor
in the virtual space by using the changes of the
coordinates of the center of gravity of the cursor (X0, Y0, Z0). For
example, the direction and magnitude of the differential vector
between the coordinates of the center of gravity at the present
time t, (X0, Y0, Z0)t, and the coordinates of the center of
gravity at a certain previous time, (X0, Y0, Z0)t-1, are
used as the moving direction and moving distance of the cursor.
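
The differential vector can be computed as in the following sketch. The function name `move_vector` and the sample coordinates are hypothetical; a zero direction is returned when the cursor has not moved, which is a choice made here and not stated in the specification.

```python
import math

def move_vector(previous, current):
    """Return (direction, distance) of the cursor between two time steps.

    previous, current: (X0, Y0, Z0) at time t-1 and at time t.
    direction is a unit vector, or all zeros when the cursor did not move.
    """
    diff = tuple(c - p for p, c in zip(previous, current))
    distance = math.sqrt(sum(v * v for v in diff))
    if distance == 0:
        return (0.0, 0.0, 0.0), 0.0
    return tuple(v / distance for v in diff), distance

# Hypothetical cursor centers of gravity at time t-1 and time t.
direction, distance = move_vector((0, 0, 0), (3, 4, 0))
```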

The shape judging means 45e judges if the shape
of the cursor stored in the cursor memory means is
proper or not for grabbing the virtual object in the shape
stored in the object memory means (whether the cursor
shape is appropriate or not for inducing interaction with
the virtual object). Herein, when the value of the
parameter a representing the shape of the object is 1, the
cursor finger open state is regarded as an appropriate
state. The cursor is judged to be in the finger open state, for
example, when the value of d is larger than the intermediate
value between the maximum value of d, that is, dmax, and 0 in the
case of the cursor shown in Fig. 22 (a), and when all
joint angles θn are greater than the intermediate value
between the maximum value θn max and 0 in the case of Fig.
22 (b), (c).

When the value of the parameter a expressing the
object shape is 0, it is an appropriate state when the
interval of finger tips of the cursor is narrow. The finger
tips of the cursor are judged to be narrow, for
example, when the value of d is smaller than the
intermediate value between the maximum value of d, dmax, and 0
in the case of the cursor in Fig. 22 (a), or when all joint
angles θn are smaller than the intermediate value between the
maximum value θn max and 0 in the case of Fig. 22 (b),
(c). As the judging method of the shape, incidentally, the
parameter expressing the cursor shape (d or θn) may be
stored preliminarily in the state of the cursor grabbing
the virtual object in contact state in the virtual space,
and when the values of the parameters coincide within a
range of ±30%, the shape may be judged to be appropriate for
grabbing action.
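
The two judging methods described for the shape judging means 45e, the intermediate value test and the ±30% comparison with a stored value, can be sketched for the two-finger interval d as follows. The function names and the numerical values are hypothetical.

```python
def fingers_open(d, d_max):
    """Finger open state: d is larger than the midpoint between d_max and 0."""
    return d > d_max / 2

def within_tolerance(value, reference, tolerance=0.30):
    """True when value coincides with reference within +/- tolerance."""
    return abs(value - reference) <= tolerance * abs(reference)

# Hypothetical values: maximum interval 10, stored grabbing interval 10.
open_state = fingers_open(7, d_max=10)
narrow_state = not fingers_open(3, d_max=10)
close_to_stored = within_tolerance(12, reference=10)
far_from_stored = within_tolerance(14, reference=10)
```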

The sight line input means 46 detects the sight line
of the operator, and calculates the coordinates noticed
by the operator on the display screen of the display
means 44 (coordinates of notice point). As the sight line
detecting means, the direction of the pupil
of the operator is detected by using a photo sensor such as a CCD
camera, and the notice point on the screen is calculated by
also measuring the position of the head of the operator by
using a camera or the like.

The learning means 45f stores the parameter (d or
θn) showing the shape of the cursor when the overall
judging means 45c judges that the cursor has grabbed
the virtual object, the parameter a showing the shape of
the grabbed object, and the relative configuration of the
position of the cursor and the position of the virtual
object (the vector linking the center of gravity of the
cursor and the center of gravity of the virtual object). When
the parameter expressing the present shape of
the cursor, the parameter expressing the shape
of the surrounding virtual object, and the configuration
of the present center of gravity of the cursor and the
center of gravity of the surrounding virtual object are
close to the past state of grabbing the object (for
example, when the parameters and element values of each
dimension of the vector expressing the configuration
coincide with the past values within a range of ±30%), it
is judged to be close to the past situation and 1 is
issued, and otherwise 0 is issued. As other means of
learning, meanwhile, the parameter expressing the
shape of the cursor when grabbing the object in the
past, the parameter a expressing the shape of the
grabbed virtual object, and the relative configuration of
the position of the cursor and position of the virtual
object may be learned by using neural networks or the
like. As the learning items, it may be also possible to
learn together with the configuration of notice point
coordinates on the screen detected by the sight line
input means 46 and coordinates of the cursor on
the display screen.
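
A minimal sketch of the table-style learning described first is given below. The class name `GrabMemory` is hypothetical. Two points are assumptions made here: the shape parameter a is treated as a label that must match exactly, and a stored vector element of 0 can only be matched by 0, since ±30% of 0 is 0.

```python
class GrabMemory:
    """Stores past grab situations and reports similarity as 1 or 0."""

    def __init__(self, tolerance=0.30):
        self.tolerance = tolerance
        self.records = []

    def learn(self, cursor_shape, object_shape, relative_vector):
        """Store one grab: cursor parameter, object parameter a, and vector."""
        self.records.append((cursor_shape, object_shape, tuple(relative_vector)))

    def _close(self, value, past):
        return abs(value - past) <= self.tolerance * abs(past)

    def similar(self, cursor_shape, object_shape, relative_vector):
        """Issue 1 if the present state is close to any stored grab, else 0."""
        for past_shape, past_object, past_vector in self.records:
            if (object_shape == past_object
                    and self._close(cursor_shape, past_shape)
                    and all(self._close(v, p)
                            for v, p in zip(relative_vector, past_vector))):
                return 1
        return 0

# Hypothetical past grab of a cube (a = 1) with two-finger interval d = 4.0.
memory = GrabMemory()
memory.learn(4.0, 1, (10, -5, 20))
near = memory.similar(4.5, 1, (11, -5.5, 22))
far = memory.similar(8.0, 1, (10, -5, 20))
```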

The coordinates transforming means 45g
transforms the coordinates used in distance calculation by
the distance calculating means when grabbing the
object (when an interaction is caused) so that the
distance between the cursor and the objective virtual
object in the virtual space may be shorter. For example,
supposing the coordinates to be (100, 150, 20) and
(105, 145, 50) when the cursor grabs the virtual object,
the coordinates transforming means transforms the
Z-coordinate having the largest difference among the
coordinates as shown in formula (1).



Z' = 0.8 × Z    (1)

where Z is the Z-coordinate of the center of gravity of the
cursor and virtual object received by the coordinates
transforming means, and Z' denotes the Z-coordinate
as the output of the coordinates transforming means.

In this case, the value of the X-coordinate and the value of
the Y-coordinate are not changed. Also the values stored in
the cursor memory means and object memory means
are not changed, and hence the screen shown by the
display means is not changed. By such transformation,
even if the distance in the virtual space is remote, once the
operator attempts to grab, the distance between the cursor
and virtual object becomes shorter when the distance is
calculated thereafter, so that the distance calculating
means can calculate a distance closer to the
sense of the distance felt by the operator.
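
The numerical example above can be worked through with a short sketch. It is
an illustration under assumptions: the function name is hypothetical, and
choosing the axis with the largest difference automatically is a
generalization of formula (1), which is stated only for the Z-coordinate. The
stored coordinates are left untouched; only the copies passed to the distance
calculation are scaled.

```python
import math


def transform_for_distance(cursor, obj, scale=0.8):
    """Scale the coordinate axis with the largest difference by `scale`.

    Returns transformed copies; the inputs are not modified.
    """
    diffs = [abs(c - o) for c, o in zip(cursor, obj)]
    axis = diffs.index(max(diffs))  # for the example this is Z (index 2)

    def shrink(point):
        scaled = list(point)
        scaled[axis] = scale * scaled[axis]
        return tuple(scaled)

    return shrink(cursor), shrink(obj)


cursor = (100, 150, 20)
target = (105, 145, 50)

cursor_t, target_t = transform_for_distance(cursor, target)
before = math.dist(cursor, target)      # sqrt(25 + 25 + 900), about 30.8
after = math.dist(cursor_t, target_t)   # sqrt(25 + 25 + 576), about 25.0

print(cursor_t, target_t)
print(round(before, 2), round(after, 2))
```

The Z difference shrinks from 30 to 24, so the computed distance drops from
about 30.8 to about 25.0 while the displayed positions stay the same.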

The overall judging means 45c judges that the
interaction of "grab" has occurred when the motion
recognizing means 45b recognizes the preliminarily
registered action of "grab" while the distance between the
cursor and the virtual object issued by the distance
calculating means 45a is less than a predetermined
reference. Thereafter, until the interaction of "grab" is
terminated, the values of the coordinates of the center
of gravity of the grabbed virtual object stored in the
object memory means 43 are matched with the
coordinates of the center of gravity of the cursor. Herein, the
predetermined reference value may be larger than the
actual distance of contact of the cursor and object in the
virtual space. For example, in the case of Fig. 25
(configuration of Fig. 24), if the distance between the virtual
object and cursor is less than the reference value of the
distance, the operator may instruct the action of grabbing
to the input means 1, and when the motion
recognizing means 45b recognizes the grabbing action,
the virtual object may be grabbed and moved.
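
The grab judgment described above can be sketched as a small state update.
This is a hedged illustration, not the embodiment itself: the class and
function names are hypothetical, and the release flag is an assumption, since
this passage does not state how the "grab" interaction is terminated.

```python
import math
from dataclasses import dataclass


@dataclass
class VirtualObject:
    center: tuple  # center of gravity held in the object memory


def update_grab(cursor_center, obj, grabbed, grab_action, release_action,
                reference):
    """Advance the grab state by one step and return the new state.

    A grab starts only when the registered grab action is recognized AND
    the cursor-object distance is below `reference`. While grabbed, the
    object's center of gravity is matched to the cursor's.
    """
    if not grabbed:
        if grab_action and math.dist(cursor_center, obj.center) < reference:
            grabbed = True
    elif release_action:  # assumed termination condition
        grabbed = False
    if grabbed:
        obj.center = cursor_center
    return grabbed


box = VirtualObject(center=(0.0, 0.0, 0.0))

# Cursor is 5 units away, reference is 10: the grab action succeeds.
state = update_grab((3.0, 4.0, 0.0), box, False, True, False, reference=10.0)
# The object then follows the cursor until release.
state = update_grab((50.0, 50.0, 0.0), box, state, False, False, reference=10.0)
print(state, box.center)
```

Because `reference` may exceed the contact distance, the object can be
grabbed before the cursor visually touches it.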

Meanwhile, if there are plural virtual objects below the
reference of the distance, the overall judging means 45c
judges only the objects for which the angle formed by the
line segment (wavy line) linking the cursor and virtual
object and the moving direction (arrow) of the cursor
calculated by the move vector calculating means 45d is
below a reference (for example, 90 degrees), as shown in
Fig. 29 (b), so that the interaction can be judged in
consideration of the moving direction of the cursor in the
process of manipulation by the operator (in the diagram, the
object at the highest position of the three is selected). As
for the cursor moving distance, if the moving distance is
longer than the predetermined moving distance
reference, interaction does not occur. Thus, when merely
moving the cursor, interaction not intended by the
operator is not caused.
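
The two move-vector conditions above can be sketched as a filter over the
candidate objects. This is an illustration under assumptions: the function
names and the default thresholds are hypothetical, and treating a stationary
cursor as "no direction constraint" is a choice made here, not something the
text specifies.

```python
import math


def norm(v):
    return math.sqrt(sum(a * a for a in v))


def angle_deg(u, v):
    """Angle between two non-zero vectors, in degrees."""
    cos = sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))


def filter_by_motion(cursor, move_vec, objects,
                     max_angle_deg=90.0, max_move=20.0):
    """Keep objects lying within `max_angle_deg` of the moving direction.

    Returns an empty list when the cursor moved farther than `max_move`,
    since in that case no interaction should occur.
    """
    moved = norm(move_vec)
    if moved > max_move:
        return []
    if moved == 0.0:  # assumption: no direction, so no angular filtering
        return list(objects)
    kept = []
    for obj in objects:
        to_obj = tuple(o - c for o, c in zip(obj, cursor))
        if norm(to_obj) == 0.0 or angle_deg(to_obj, move_vec) < max_angle_deg:
            kept.append(obj)
    return kept


cursor = (0.0, 0.0, 0.0)
objects = [(5.0, 1.0, 0.0), (-5.0, 0.0, 0.0), (-1.0, 5.0, 0.0)]

ahead = filter_by_motion(cursor, (1.0, 0.0, 0.0), objects)
too_fast = filter_by_motion(cursor, (100.0, 0.0, 0.0), objects)
print(ahead, too_fast)
```

Only the object roughly ahead of the cursor survives the angular test, and a
long sweep of the cursor yields no candidates at all.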

Moreover, as shown in Fig. 29 (c), when plural virtual
objects satisfy the reference, the virtual
object close to the position of the notice point detected
by the sight line input means 46 is the object of grabbing
by the overall judging means 45c (in the diagram, the
object at the left side close to the mark indicating the
notice point is to be selected). As a result, the object can
be easily selected by using the sight line of the operator.
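
The sight-line tie-break above can be sketched as a nearest-point selection.
This is a minimal illustration only: the function name, the object labels and
the assumption that each candidate already has a two-dimensional position on
the screen are hypothetical.

```python
import math


def select_by_gaze(notice_point, candidates):
    """Return the name of the candidate nearest the notice point.

    `candidates` maps an object name to its (x, y) screen position.
    Returns None when there are no candidates.
    """
    if not candidates:
        return None
    return min(candidates,
               key=lambda name: math.dist(candidates[name], notice_point))


# Hypothetical screen positions of objects that passed the distance test.
candidates = {"left": (120.0, 200.0), "right": (480.0, 210.0)}
notice_point = (150.0, 190.0)

print(select_by_gaze(notice_point, candidates))  # left
```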

Incidentally, as shown in Fig. 29 (d), if there is a
close object on the screen, the virtual object coinciding
with the shape of the cursor judged by the shape
judging means 45e is the object of grabbing by the overall
judging means 45c (in the diagram, since the finger
interval of the cursor is narrow, a plane is judged to be
appropriate as the object of grabbing, and hence the
plane is selected). As a result, the virtual object
intended by the operator can be selected by the shape
of the cursor, and the operator can manipulate easily
because the cursor shape corresponds to one that is
easily associated with grabbing the virtual object.

The overall judging means 45c selects by priority
the virtual object judged by the learning means 45f to be
similar to the situation in which an object was grabbed in
the past. As a result, a judgment closer to the past
manipulation by the operator can be reproduced, and the
controllability may be enhanced.

Thus, according to the invention, presence or
absence of interaction between the cursor manipulated
by the operator in the virtual space and the virtual object
in the virtual space is determined not only by the
distance between the cursor and virtual object, but also
on the basis of the action, sight line or past
case in the manipulation of the operator, so that the
controllability may be enhanced in the interface for
interaction with the virtual object by using the cursor in the
virtual space.

The embodiments have been explained by referring 
to the action of grabbing the virtual object by using the 
cursor as the interaction, but similar handling is also 
possible in other motions, such as indicating (pointing) 
to the virtual object, collision, friction, impact, and 
remote control. Similar effects are obtained if the virtual 
space is a two-dimensional space or if the display 
means is a three-dimensional display. It may be realized 
by using hardware, or by using the software on the com- 
puter. 

In this way, according to the embodiments, presence
or absence of occurrence of interaction between
the cursor manipulated by the operator in the virtual
space and the virtual object in the virtual space is not
determined only by the distance between the constituent
elements of the cursor and virtual object in the virtual
space. Instead, the overall judging means judges presence
or absence of occurrence of interaction on the basis of
the distance between representative points calculated
by the distance calculating means and the motion of the
cursor recognized by the motion recognizing means.
Therefore an interaction may be induced also on an
object whose distance is not necessarily close in the
virtual space, so that an input and output interface
excellent in controllability may be presented. Moreover,
it is not necessary to calculate the distance of all the
constituent elements between the cursor and virtual
object in the virtual space as required in the
conventional contact judging method, and therefore the
quantity of calculation is lessened, and the processing speed
is enhanced.

Accordingly, the invention can recognize the shape
or motion of the hand of the operator and display the
feature of the shape of the recognized hand as a cursor
on the screen as the special shape, so that the
information displayed on the screen can be controlled
easily, with superior controllability, by the shape or
motion of the hand.

Moreover, the feature of the shape of the hand is 
displayed as cursor on the screen as the special shape, 
and the relation with other display objects than the cur- 
sor is judged sequentially and automatically by the inter- 
action along the intent of the operator, so that the 
interface further enhanced in the controllability of 
manipulation of indicating or gripping the display object 
can be realized. 
Claims

1. An interface apparatus comprising:

recognizing means for recognizing the shape
of a hand of an operator,

display means for displaying the features of the
shape of the hand recognized by the recognizing
means on the screen as a special shape,
and

control means for controlling the information
displayed in the screen by the special shape
displayed in the screen by the display means.

2. An interface apparatus of claim 1, wherein the
recognizing means recognizes the motion of the hand
together with the shape of the hand, and the display
means displays the features of the shape and
motion of the hand recognized by the recognizing
means on the screen as the special shape.

3. An interface apparatus of claim 1, wherein the
control means controls so as to select the information
displayed in the screen.

4. An interface apparatus composed of

at least an image pickup unit,

a motion recognizing unit for recognizing
the shape or move of an object in a taken
picture, and

a display unit for displaying the shape or
move of the object recognized by the
motion recognizing unit, comprising:

a frame memory for storing the image taken by
the image pickup unit, and

a reference image memory for storing the
image taken before the image saved in the
frame memory as reference image,

wherein the motion recognizing unit
comprises an image change depicting unit
for depicting the difference between the
image in the frame memory and the
reference image stored in the reference image
memory.

5. An interface apparatus of claim 4, further comprising
a reference image updating unit for updating the
reference image accumulated in the reference
image memory into a novel image.

6. An interface apparatus of claim 4, further comprising
a timer for calculating the interval of updating of
reference image provided in the reference image
updating unit.

7. An interface apparatus comprising contour depicting
means composed of:

at least an image pickup unit,

a motion recognizing unit for recognizing the
shape and/or move of the hand of the user in
the taken image, and

a display unit for displaying the shape and/or
move of the hand of the user recognized by the
motion recognizing unit, thereby
depicting the contour of the taken user image,

a contour waveform operation unit for tracing
the depicted contour, and calculating the
relation between the angle of the contour line and
the length of contour line, that is, the contour
waveform, and a

shape filter for filtering the contour waveform
calculated by the contour waveform operation
unit for generating a shape waveform expressing
a specific shape, whereby composing the
motion recognizing unit.

8. An interface apparatus of claim 7, wherein plural
shape filters are composed by plural band pass
filters differing in band, and the motion of the user is
judged on the basis of the shape waveform
generated by the plural shape filters.

9. An interface apparatus of claim 7, wherein plural
shape filters are composed of at least a band pass
filter of contour waveform shape corresponding to
undulations of hand, and a band pass filter of
contour waveform shape corresponding to undulations
of fingers.

10. An interface apparatus of claim 7, wherein the
motion recognizing unit is composed by comprising
a coordinate table for storing the contrast of the
coordinates of the contour shape of the taken user
image and the contour waveform calculated by the
contour shape operation unit, and a coordinate
operation unit for calculating the coordinates of the
location of the specified shape in the taken image
by using the wave crest location of the shape
waveform and the coordinate table.

11. An interface apparatus of claim 7, wherein the
motion recognizing unit is composed by containing
a shape judging unit for counting the number of
pulses in the shape waveform generated by the
shape filter, and the shape of the object is judged by
the output value of the shape judging unit.

12. An interface apparatus of claim 7, wherein the 
motion recognizing unit is composed by containing 
differentiating devices for differentiating the shape 
waveforms generated by the shape filter. 

13. An interface apparatus comprising:

display means,

input means for changing the position and
shape of the cursor displayed in the display
means,

cursor memory means for storing coordinates
of a representative point representing the
position of the cursor and the shape of the cursor,

object memory means for storing coordinates
of a representative point representing the
position of display object other than the cursor and
shape of the display object, and

interaction judging means for judging
interaction between the cursor and the display object,
by using the position and shape of the cursor
stored in the cursor memory means and the
position and shape of the display object stored
in the object memory means,

wherein the interaction judging means is
composed of

distance calculating means for calculating
the distance between at least one
representative point of the cursor and at least
one representative point of the display
object,

motion recognizing means for recognizing
the move of the cursor or change of the
shape, and

overall judging means for determining the
interaction of the cursor and display object
by using the distance calculated by the
distance calculating means and the result of
recognition by the motion recognizing
means.

14. An interface apparatus of claim 13, wherein when
the motion recognizing means recognizes a
preliminarily registered motion, the interaction judging
means induces an interaction on the display object
of which distance calculated by the distance
calculating means is below a predetermined reference.

15. An interface apparatus of claim 13, wherein by
installing move vector calculating means for
calculating the moving direction and moving distance of
the cursor in the display space, the interaction
judging means is composed, and the interaction judging
means determines the interaction of the cursor and
display object by the interaction judging means on
the basis of the moving direction of the cursor and
moving distance of the cursor calculated by the
move direction calculating means.

16. An interface apparatus of claim 15, wherein the 
interaction judging means generates an interaction 
when the cursor moving distance calculated by the 
move vector calculating means is less than the pre- 
determined reference value. 

17. An interface apparatus of claim 15, wherein the 
interaction judging means generates an interaction 
on the display object existing near the extension 
line in the moving direction of the cursor calculated 
by the move vector calculating means. 

18. An interface apparatus of claim 13, wherein the 
interaction judging means generates an interaction 
when the shape of the cursor and shape of the dis- 
play object become a preliminarily registered com- 
bination. 

19. An interface apparatus of claim 13, wherein by 
composing the interaction judging means by incor- 
porating shape judging means for recognizing the 
shape of the cursor and shape of the display object, 
the interaction judging means generates an interac- 
tion when the shape of the cursor and shape of the 
display object recognized by the shape recognizing 
means coincide with each other. 

20. An interface apparatus of claim 13, wherein by
comprising sight line input means for detecting
sight line direction, the interaction judging means
generates an interaction when the motion
recognizing means recognizes a preliminarily registered
motion, on the display object near the extension line
of the sight line detected by the sight line input
means.

21. An interface apparatus of claim 20, wherein the 
interaction judging means generates an interaction 
when the cursor is present near the extension line 
of the sight line on the display object near the exten- 
sion line of the sight line detected by the sight line 
input means and the motion recognizing means 
recognizes a preliminarily registered motion. 

22. An interface apparatus of claim 13, wherein when 
an interaction is generated, learning means may be 
provided for learning the configuration of the cursor 
and the objective display object, and the shape of 
the cursor and shape of the display object, so that 
the interaction judging means determines the inter- 
action on the basis of the learning result of the 
learning means. 
23. An interface apparatus of claim 22, wherein the
interaction judging means generates an interaction
when the configuration of the cursor and objective
display object or the shape of the cursor and shape
of the display object may be similar to the
configuration or shapes learned in the past by the learning
means.

24. An interface apparatus of claim 13, wherein the
interaction judging means may be composed by
incorporating coordinate transforming means for
transforming the coordinates from the cursor
memory unit and object memory unit to the input to the
distance calculating means.

25. An interface apparatus of claim 24, wherein the cur- 
sor and objective display object may be brought 
closer to each other when an interaction is gener- 
ated. 